Categorization of tinnitus listeners with a focus on cochlear synaptopathy

Tinnitus is a complex and not yet fully understood phenomenon. Often the treatments provided are effective only for subgroups of sufferers. We are presently not able to predict benefit with the currently available diagnostic tools and analysis methods. Being able to identify and specifically treat sub-categories of tinnitus would help develop and implement more targeted treatments with higher success rate. In this study we use a clustering analysis based on 17 predictors to cluster an audiologically homogeneous group of normal hearing participants, both with and without tinnitus. The predictors have been chosen to be either tinnitus-specific measures or measures that are thought to be connected to cochlear synaptopathy. Our aim was to identify a subgroup of participants with characteristics consistent with the current hypothesized impact of cochlear synaptopathy. Our results show that this approach can separate the listeners into different clusters. But not in all cases could the tinnitus sufferers be separated from the control group. Another challenge is the use of categorical measures which seem to dominate the importance analysis of the factors. The study showed that data-driven clustering of a homogeneous listener group based on a mixed set of experimental outcome measures is a promising tool for tinnitus sub-typing, with the caveat that sample sizes might need to be sufficiently high, and higher than in the present study, to keep a meaningful sample size after clustering.


Introduction
Subjective tinnitus, the perception of a phantom sound in the absence of an external stimulus, is a complex phenomenon whose causes and mechanisms are not yet completely understood. Due to a lack of standardization to assess tinnitus and due to multiple definitions of the phenomenon, the real prevalence of tinnitus is largely unknown. Studies report prevalence ranging from few percent to around 30% [1]. The American Tinnitus Association estimates that around 15% of Americans have subjective tinnitus, which, differently from objective tinnitus, is not related to blood flow or muscular-skeletal mechanisms [2]. Despite this, tinnitus research is still very inconclusive, and treatment outcomes are, in many cases, only placebo effects [3]. One of the challenges is that much of the current literature on tinnitus sufferers include listeners with tinnitus and a hearing loss [4], which potentially complicates the interpretation of the results. With respect to the mechanisms, hearing loss can roughly be divided into sensorineural hearing (SNHL) loss and conductive hearing loss (CHL). SNHL is mainly defined as reduced audibility due to damage of the inner ear or the auditory nerve and attributed to aging or induced by noise. CHL is related to problems in the transmission of the sound either in the external ear (e.g cerumen impaction) or in the middle ear (e.g fluid presence). Regardless of the type of hearing loss, tinnitus can also be present. Moreover, hearing loss and tinnitus are also two possible symptoms of Meniere's disease and acoustic neuroma [5]. In addition, in human studies the otologic background of each subject can be difficult to assess. A characterization of tinnitus in homogeneous subgroups would be helpful to understand the underlying mechanisms behind the issue and proceed to the development of focused treatments [3].
One approach to shed light on the mechanisms underlying tinnitus is to identify biomarkers that are independent of any subjective reporting of the listener. The results of different studies aimed to find biomarkers for tinnitus are inconsistent and often suggest opposing theories to explain the observed phenomena. However, a recurrent hypothesis associates tinnitus with an increased neural firing that could be either more centrally located or in more peripheral locations [6]. Non-invasive electrophysiological and imaging techniques like electroencephalography (EEG) [7], auditory brainstem response (ABR) [8][9][10], magnetoencephalography (MEG), functional magnetic resonance imaging (fMRI) [11,12] and positron emission tomography (PET) have been widely used to try to test this hypothesis and localize the source of the tinnitus.
One potential mechanism for reduced input into the brain is the deafferentation of auditory nerve fibers (cochlear synaptopathy, CS) in the inner ear. This might lead to a compensatory increase in neural gain, which itself could cause tinnitus [13]. Numerical simulations have shown that CS might lead to overrepresentation of specific frequencies which might be one mechanism underlying tinnitus [8]. One of the main assumptions underlying this appraoch is that a noise trauma triggers the subsequent degeneration of spiral ganglion neurons (SGN) (Fig 1). This degeneration will not immediately be visible in the audiometric thresholds, but The assumption is that a noise trauma triggers a temporal threshold shift, accompanied by cochlear synaptopathy. With time, the thereby reduced input into the brain stem leads to tinnitus and to spiral ganglion neuron (SGN) degeneration. Progressive effects of synaptopathy and SGN degeneration lead to high-frequency hearing loss under the assumption of higher vulnerability of the basal part of the cochlea. The same timeline might apply to age-related cochlear synaptopathy and shortened by noise overexposure.
https://doi.org/10.1371/journal.pone.0277023.g001 first be evident after a critical time and after a critical amount of SGN degeneration has been reached [14,15]. It has also recently been shown in human temporal bones, that synaptopathy highly correlates with age, suggesting it being omnipresent also in normal hearing listeners [16].
A recent study showed that tinnitus sufferers had lower middle ear muscle reflexes (MEMRs) compared to a control group [17]. This suggested that the presence of tinnitus might be related to the functioning of the efferent system which, in turn, might be connected to CS. The use of MEMR as a measure of synaptopathy-related tinnitus is particularly promising both for being fast and for being a physiological objective biomarker. However other studies showed no connection between MEMR and tinnitus [18].
Besides physiological measures, behavioral measures might contribute to identifying subgroups of mechanisms underlying tinnitus.
A combination of assessing tinnitus frequency, tinnitus loudness, psychophysical tuning curves (PTC) and tinnitus tuning curves (TTC) revealed differences in a group of tinnitus listeners [19]. This variability potentially provides information about the tuning properties of the system and thereby the potential mechanism underlying tinnitus. As for synaptopathy, high frequency audiometry (HFA) has been selected as one proxy of the presence of synaptopathy as animal models showed a higher prevalence for synaptopathy at tonotopic places connected to high frequencies [20]. Finally, adaptive categorial loudness scaling (ACALOS) was reported to provide indications about the presence of hyperacusis, that has been suggested to be an important confound to take in account [21].
To develop a screening procedure for tinnitus which allows tinnitus subtyping, the selection of effective indicators of tinnitus and synaptopathy is critical. At the same time, the screening should be testing different parts of the system while not being too extensive in time to avoid effects of fatigue.
The present study attempts to categorize a subclass of tinnitus sufferers with normal hearing thresholds. A screening procedure was developed, consisting of psychophysical and physiological measures hypothesized to be sensitive to the presence of tinnitus and cochlear synaptopathy.
The data collected were used as input for a clustering algorithm. The objective to apply a clustering algorithm were to check if the algorithm is able to discriminate between tinnitus and a control group, and to evaluate if application of this algorithm can identify subgroups in the group of tinnitus sufferers.

Participants
Twenty listeners suffering from tinnitus participated in the experiment (tinnitus group, mean age 32.2 years, 6 females). An age (+/-4 years, mean 0.73, std 1.34) and audiogram (+/-10 dB per each frequency) matched group of 15 normal hearing listeners (30.2 mean age, 6 females) was included as control (control group). Inclusion criteria for the tinnitus group were a) non pulsatile, subjective persistent tinnitus (i.e. not transient as related to a temporal threshold shift or upper respiratory tract infection); b) audiometric thresholds not higher than 20 dB HL for the for the following audiometric frequencies: 125, 250, 1000, 2000, 3000, 4000, 6000, 8000 Hz. Each listener in the tinnitus group completed the Tinnitus Handicap Inventory (THI) questionnaire (mean: 22.2, min: 4, max: 54, standard deviation: 13.34). Otoscopy was performed to exclude obstruction by cerumen or infections. All participants provided informed consent and all experiments were approved by the Science-Ethics Committee for the Capital Region of Denmark (reference H-16036391).

Apparatus
The psychoacoustical measures (high frequency audiometry, psychophysical tuning curves, tinnitus likeness, adaptive categorical loudness scaling) were implemented as custom software in MATLAB [22]. Stimuli were generated digitially, converted into an analogue waveform (RME Fireface UCX soundcard), amplified (SPL Phonitor MINI) and presented through headphones (Sennheiser HDA 200). The calibration was done by equalizing the transfer function (amplitude and phase) of the transducer using a FIR filter of order 2047. The transfer function was measured using the sound level meter NorSonic Nor139, the artificial ear: G.R.A.S. 43AA-S2 (ear simulator kit according to IEC 60318-1 & -2, with pre-polarized microphone) and the calibrator: Brüel & Kjaer 4230 or 4231 sound calibrator. The wide band typanometry (WBT) and MEMR latencies were measured with the Interacoustics Research Platform and the Titan Suite software (Interacoustics A/S). The Titan research platform was used to measure the WBT and MEMR. The Titan Suite software was used to calculate the latencies.

Tinnitus likeness
Tinnitus likeness was measured for all listeners in the tinnitus group. The loudness and pitch of the individual listeners' tinnitus was measured using a tinnitus likeness (TL) procedure based on [23] using a) pure tones and b) 1/3rd octave wide narrow-band noise with Gaussian amplitude distribution as a probe. In a first step, the listeners were asked to adjust the level of a 1 kHz probe tone to match the loudness of the tinnitus. This was repeated three times. The average level of these three repetitions was used as the probe level for the frequency rating of the tinnitus in the second step. To match the pitch of the tinnitus, probes tones with frequencies between 0.125 kHz and 14 kHz were randomly presented for a duration of 2 s including 50 ms raised-cosine ramps at on-and offset. For each frequency, the listener was asked to rate the likeness of the probe with their tinnitus on a scale between 1 ("not similar at all to my tinnitus") to 10 ("sounds exactly like my tinnitus"). The probe was repeated in intervals with 2 s silent intervals, until the participant did not give an answer. The listener was also given the option "not heard". The maximum of the likeness spectrum was identified and used for loudness matching with tones in the third step: The listeners were asked to match the level of the probe centered at this frequency with the loudness of their tinnitus. This was repeated three times. The average level of these three repetitions was used in step four: To match the pitch of the tinnitus with the narrow-band noise probes, probes with the same center frequencies as in the second step were used.

Adaptive categorical loudness scaling
Loudness growth was measured using an adaptive categorical loudness scaling (ACALOS) procedure [24] for all listeners in the control group and the tinnitus group. The stimulus consisted of a 1/3rd octave wide noise centered at the test frequency (0.25 kHz, 0.5 kHz, 1 kHz, 2 kHz, 4 kHz, 6 kHz) with a duration of 1000 ms including 50 ms raised-cosine on-and offset ramps. The frequencies were presented in random order and the listeners were asked to rate each sound from "Not heard" (corresponding to a value of 1) to "Extremely loud" (corresponding to a value of 50). The maximum level of the stimulus was set to 95 dB HL. In the the first trial, the level of the stimulus was set to 65 dB SPL. The next trial in a run was chosen dependent on the answer given in the previous sound presentation and a continuous function was fit through the measured data points to obtain the frequency specific loudness growth function. For details on the procedure, please see [24] and [25].

Wideband tympanometry and middle ear muscle reflex
Wideband tymapanometry (WBT) and MEMR were measured ipsilaterally using the Titan system (Interacoustics A/S) and a custom software (Interacoustics Research Platform) implemented in [22] for all listsners in the control group and the tinnitus group. To measure the WBT the ear canal pressure was swept between -300 daPa and 200 daPa in descending and ascending direction with a speed of 100 or 300 daPa/s. The data were consistent for the different pump speeds in the same listener when tested in a subset of the listeners. The individual tympanometric peak pressure (TPP) was defined as the peak of the WBT and used for the pressurization before measuring the MEMR. The middle ear muscle reflex (MEMR) was measured using a paradigm based on [26] consisting of a series of clicks and activators (click-activatorclick). The activator in this case was 506ms of white noise with onset and offset of 2.3 ms, respectively, obtained with a Kaiser window. The level of the noise was varied between 75 dB SPL and 105 dB SPL in steps of 5 dB. The transducer was calibrated based on peak-to-peak voltage of a 1 kHz tone. The strength of the reflex was quantified using the change in sound energy between the average of four stimuli with the elicitor and the baseline [27]. This will be referred to as DELTA absorbance. The threshold criterion for presence of the MEMR was chosen arbitrarily to be a DELTA absorbance of 0.03. The latency of the MEMR was recorded with the Titan suite software for pure tone probes with frequency of 500 Hz. The threshold used in this part of the experiment was calculated with the Titan Suite. The latency was calculated at a level of 10 dB above the reflex threshold subject to an upper limit of 100 dB SPL.

Fast high-frequency audiometry, psychophysical tuning curves and tinnitus tuning curves
High-frequency audiometry (HFA), psychophysical tuning curves (PTC), and tinnitus tuning curves (TTC) were implemented using a Bayesian algorithm based on [28]. HFA and PTC were measured for all listsners in the control group and the tinnitus group. TTC were measured only for the listeners in the tinnitus group. The algorithm maximizes the information obtained from each estimate by selection of the measurement parameter to reduce the uncertainty based on the listeners responses. The parameters used in HFA, PTC and TMC are summarized in Table 1.
The HFA was implemented as a 1-interval-2-alternative forced choice (1-I-2-AFC) procedure. One stimulus interval of the HFA contained a series of three tone bursts, each with a duration of 250 ms including 20 ms on-and offset ramps and an inter-pulse gap of 100 ms. The tone bursts had frequencies between 8 kHz and 16 kHz. The listeners were asked to indicate if they heard the tone ("Yes") or not ("No"). The PTC was implemented as a 2-interval-2-alternative forced choice (2-I-2-AFC) procedure. For the PTC, the stimulus contained two noise bursts with a bandwidth of 0.5 octaves and Gaussian amplitude statistics. The noise bursts were generated in the frequency domain by setting everything outside the pass band to zero. The tone bursts were centered at 1 kHz (for the PTC at 1 kHz, PTC1k) or at the tinnitus frequency identified by the tinnitus likeness measure (TL). The probe for the PTC at tinnitus frequency (PTCtf) had a constant level corresponding to the loudness found in the tinnitus loudness matching. If that level was below threshold, the level was set to 5 dB above the value found in the loudness matching experiment. The same frequency and level used for a given listener from the tinnitus group was used for the matched listener in the control group. The probe for the PTC at 1 kHz, had a level of 10 dB above the individual threshold in quiet at 1 kHz. The center frequency of the noise was varied between +/-1 octaves around the the reference frequency. The listeners were asked if the two sounds were different.
The TTC was implemented as a 1-interval-2-alternative forced choice (1-I-2-AFC) procedure. For the TTC, the stimulus was a single 0.5 octave-wide noise burst with Gaussian amplitude distribution and a duration of 2000 ms including 20 ms on-and offset ramps. The noise was centered at the tinnitus frequency. The listeners were asked to indicate whether the tinnitus is audible ("Yes") or not ("No") in the presence of the noise.

Clustering
The collected data were organized into a 35 x 18 matrix with (participant x measures). The MEMR was measured twice, but only one of the two was included in the processing. The decision was based on the WBT. The WBT had to be positive and have the maximum peak in the pressure range between -150 to 150 daPa. If both measurements met these criteria, the first measurement was used. In addition to the strength of the MEMR, the MEMR latency for a tone at 500 Hz was also included in the predictors. The PTCtfs were projected onto categorical outputs: U-shaped (USH), flat (F), other (O). The TTCs were projected onto the categories: USH, F, O or non-maskable (NM). The categorization was conducted by three independent judges, distributing the raw data into the provided categories. The ACALOS was quantified in a single number derived from the mean (through the various frequencies) of the difference between the most comfortable level and hearing threshold. The tonal/noisy and the tone likeness of the tinnitus are information retrieved through the tinnitus likeness test. The first indicate the type of tinnitus and the second the higher likeness rating for tones. The HFA was tested twice. The median of the two tests has been calculated separately and the mean of the two values was chosen as input for the clustering. Age, tinnitus type (monaural or binaural), and the Tinnitus Hearing Impairment score were obtained through direct communication and the questionnaire.
To determine the optimal number of clusters, the silhouette algorithm was used [29]. The silhouette calculates for each observation the ratio of the distance intra-cluster and inter-clusters. The closer the output value of the silhouette is to 1 the better is the clustering. The closer is to 0, the higher is the uncertainty about which cluster the observation belongs to. In this analysis the Gower distance was used. The Gower distance, or similarity, can be calculated for numerical, categorical, logical or text data. In particular, the distance between two observations would be the average of the feature-specific distances in the range [0 1]. If the feature is numeric, the distance will be the ratio between the difference of the two values and the maximum range for that feature. If the feature is categorical, the value will be 1 if the features fall in the same category, 0 otherwise.
After selecting the number of clusters, the algorithm Partitioning around Medoids (PAM) was applied to identify the clusters. The algorithm identifies k objects, medoids, defining the clusters, where k is the number of clusters. The medoids are always defined among the elements to cluster [30]. The strategy of the algorithm is to minimizes the distance between the object to classify and one of the medoids. The distance between the objects to classify and the medoid is given calculating the Gower distance [31].

Random forest
Given the output of the clustering, the next step was to try to understand which features contributed the most to the classification. To be able to select the most informative predictors, a Random Forest algorithm [32] available in MATLAB [22] was used. First, a set of 500 decision tree weak learners has been generated and trained for regression with a Bagging method. Bagging, as well as Boosting, is a method that consists in training more models with the same learning algorithm randomly sampling data-points to generate training data sets for the weak learners [33]. We then used the ensemble regression function that combines the weak learner models with the data to improve the prediction accuracy of the learning. To avoid selecting the settings manually, a Bayesian Optimization (BO) was used to optimize the choice of the hyperparameters [34]. The parameters that were optimized are: method (either Bagging or LSboost), the number of cycles for which the ensemble is trained, the learning rate, and complexity. The importance of the different predictors was calculated for better interpretability of the clusters relative to the used dimensions. To maximize the reliability of the result this routine has been run 50 times and the importance factor evaluated as the mean of all outputs.

Tinnitus likeness
The results of the Tinnitus Likeness were split into two categories: i) The rating of similarity of the participant's tinnitus with the tone that they picked to be the most similar (TonLiken), and ii) the information whether the tinnitus was rated more similar to a tone or to noise (Ton/ Nois). Out of 20 tinnitus participants five (s03, s09, s12, s14, s20) rated their tinnitus to be more similar to noise, twelve rated it to be closer to a tone, and three (s19, s15, s08) rated in the same way the noise and the tone. The TonLiken was very variable across participants ranging from 1.67 (s12) to 10 (s18) on a scale from 0 to 10. The participants whose ratings were below 6 were s01, s09, s12 and s20 and roughly overlap with the group which rated the noise to be more representative of their tinnitus. Participant s01, however could not find any sound which resembled their tinnitus. Individual data can be found in S1, S4 and S5 Figs. The majority of the tinnitus group rated their tinnitus to be more similar to high frequency sounds. Since the PTCtf test was performed presenting a tone whose frequency was supposed to be the most close to the tinnitus, the information for tones is reported. Nine out of 20 participants chose the most similar tone to their tinnitus to be at 14 kHz, two at 12.5 kHz, one at 10 kHz four at 8kHz, two at 6 kHz, one at 3 kHz, one at 500 Hz (Table 2). . Individual data can be found in S6 and S7 Figs. The loudness growth functions were quantified using the lower slope, the upper slope, and the breakpoint of the fitting function. The ACALOS was only performed until 95 dB HL. For some of the listeners the loudness perception didn't reach the maximum rating (50) either for all the frequencies or a subset of them at this stimulus level. Hence, the mean of the difference between the most comfortable level (MCL) and the hearing threshold (HTL) across frequencies was used as an indicator for the presence of hyperacusis. The ACALOS rating ranged from 54.67 dB (s09) to 82.67 dB (c18). The median of the tinnitus group was 68.83 dB and the median of the control group was 75.33 dB. In line with the initial hypothesis, the most comfortable level was higher for the control group compared to the tinnitus group. But the difference was not statistically significant (Fig 2, panel C). Because no clear criteria exist, no binary classification (hyperacusis versus non-hyperacusis) was made before proceeding to the clustering and the MCL-HTL measure was used as input for the clustering. Individual data can be found in S12 and S13 Figs. The MEMR strength was derived as the mean of the Delta Absorbance across frequency for each level for further analysis. In general, the MEMR showed a stronger response (grey) for high stimulus levels compared to low stimulus levels (red). The MEMR shows, however, high variability across listeners. For 105 dB SPL, for example, the minimum MEMR strength was 0.0324 (s13), while the maximum (not considering the outlier c20) was a factor of 20 higher with 0.6661 (c04). Similar patterns were found at the other levels. While some listeners had a strong response, others didn't show any, even at higher levels (Fig 3 panel B). When comparing the tinnitus and control group, there was no significant difference at any level (Wilcoxon rank sum test, p > 0.05). The MEMR was tested twice, but only the less noisy result was kept. The MEMR for s11 resulted in too noisy data and was therefore discarded. One listener (s9) in the tinnitus group had a particularly high compliance of the tympanic membrane, therefore the WBT was tested with the clinical setting before testing the MEMR. The latency of the MEMR was calculated with the clinical setting of the Titan (Interacoustics A/S) at the frequency of 500 Hz. In the following analysis it will be referred to as 'RL500'. The results for this variable ranged from 35 ms to 239 ms (mean 127.31 and standard deviation 51.60 ms). There was no significant difference between the tinnitus and the control group (Wilcoxon rank sum test, p> 0.05).

Wideband tympanometry and middle ear muscle reflex
Fast high-frequency audiometry, psychophysical tuning curves, and tinnitus tuning curves   probability to be within one SD in the 50% contour graph. The circles and the crosses represent the responses to the trials. Circles represent trials that have been heard ("yes" to the question "Did you hear the sound?"), and crosses represent sounds that have not been heard ("no" to the question "Did you hear the sound?"). The standard audiometry (125-8000 Hz) for the two listeners is presented in the inset. The two listeners in panels A and B represent the differences found in the group of audiometrically normal hearing listeners (i.e., in the range from 125 Hz to 8 k Hz). Some listeners showed thresholds <15 dB HL for the whole frequency range up to 16 kHz, while others showed a strong decline in sensitivity at frequencies higher than 8 kHz. Across subjects, the HFA results showed high variability. The HFA was measured twice, and for each measurement the median across frequencies was calculated. The mean of the two medians was used for further analysis ('MMedian'). The intersubject standard deviation of the mean of the two median values of the two HFA measurements was 13.99 (max:41.42 min:-4.22). This was expected since the age range of the participants varies from 22 to 60 years old. A strong correlation of the HFA with age was found (Spearman correlation, 0.74; p = e-7). The test-retest reliability of the median of the two measurements was quantified using the ICC measure [35,36]. The repeatability of the HFA results was strong (ICC = 0.989). The median of the HFA measure for tinnitus listeners was 7.815 dB HL, while for controls was 3.25 dB HL. The difference was not statistically significant (Wilcoxon rank sum test, p> 0.05). Fig 5 shows the PTC at 1 kHz (PTC1k, panel A), the PTC at tinnitus frequency (PTCtf, panel B) and the tinnitus tuning curve (TTC, panel C) for the same listener. Individual data can be found in S2, S3, S14-S18 Figs. The vertical lines indicate the frequency of the probe frequency for the PTCtf and TTC derived from the the tinnitus (the output of the tinnitus likeness). This listener had a tuned response at the tinnitus frequency (panel B). For many subjects, however, the PTCtf was challenging, especially for those with tinnitus at very high frequencies (12,5 or 14 kHz). Therefore, the data above 12,5 kHz were considered unreliable and therefore excluded from the analysis. The PTCtf results have been classified either Tuned or Non-Tuned (T, NT) by three independent judges. Out of 35 participants, 15 had a tuned response (s01, s03, s04, s05, s10, s12, s14, s16, s17, s19, s20, c14, c15, c16, c18) and the rest had non-tuned response.
The TTC (panel C) was reported to be challenging especially by the participants describing their tinnitus as bilateral. All tests were conducted in one ear only. Hence, the masker was only

PLOS ONE
provided in one ear. It is likely that the difficulty in the disassociation between the two ears made the test harder for this group of listeners. For each listener, the ear in which the tinnitus was subjectivity considered stronger was selected. If the tinnitus was self-reported as equally bothering, the choice was random. Participants reporting their tinnitus as central (s01, s07, s20) were considered in the same way as the one reporting it on both sides (s03, s04, s05, s06, s10, s11, s12, s13, s14, s17, s18, s19), taking into account a potential different behaviour of the TTC. Fewer participants had monoaural tinnitus (s02, s08, s09, s15, s16). This information was reported as 'Mono/Bi'. The TTC results have been classified as Tuned, Non-Tuned or Non-Maskable (T, NT, NM) by three independent judges. Out of 20 tinnitus participants 4 resulted to have Non-Maskable tinnitus (s01, s02, s10, s20), 6 resulted to have Tuned TTC (s07, s13, s14, s17, s18, s19) and the rest Non-Tuned.

Clustering
The results of the clustering differed depending on the inclusion or the exclusion of the tinnitus-specific predictors.
The silhouette algorithm resulted in similar scores for 2 and 5 clusters for inclusion of all predictors. Hence, the clustering algorithm was performed with 2 and 5 clusters. The results for 2 clusters showed a clear separation of the tinnitus-and the control group (not shown). Of the tinnitus listeners, only subject s15 was included in the control group, despite suffering from tinnitus. The clear separation between the listener groups is anticipated because all predictors, including those exclusive to the tinnitus group, where available to cluster the data.
Although trivial, this outcome shows the feasibility of the suggested clustering approach for the provided input data. The results for the 5 clusters (Fig 6) showed one outlier in the control group (c20) which presents the only element in Cluster 5. One cluster was found with all the remaining listeners from the control group plus one tinnitus participant (s15) as part of Cluster 4. The rest of the tinnitus group divided into 3 clusters (Cluster 1,2, and 3). It is interesting to note how tinnitus participants grouped in three different clusters. However, also in this case, since the clustering included tinnitus specific measures (Tinnitus Likeness, TTC, Mono/ Bi, Ton/Nois) that were absent for controls (NaN) the separation between controls and tinnitus participants was anticipated.
To avoid biases due to the tinnitus-specific predictors, an additional clustering analysis was performed on the participants excluding the measures exclusive to the tinnitus group. Based on the silhouette algorithm with the predictors excluding the tinnitus-specific parameters, the clustering algorithm was run with 3 clusters.
The result of the clustering is shown in Fig 7. One cluster contained only the outlier c20, as in the clustering results with 5 clusters (see Fig 6). The other two clusters contain both control and tinnitus participants in different ratios. The first cluster contains the majority of the controls (10 out of 15) and about the half of the tinnitus participants (9 out of 20). The second cluster contains only few controls (4 out of 15) and about the half of the tinnitus participants (11 out of 20). Table 3 shows the means of the numerical predictors and the categorical predictors for each cluster.
For the 5-cluster analysis, clusters 1-3 (which contained all but one listeners from the tinnitus group) varied mainly in terms of HFA, MEMR95, RL500 and the PTCtf. There is consistency that the on average youngest listeners (Cl1) had the lowest HFA thresholds and the strongest MEMR95. The on average oldest listeners (Cl3) had the highest HFA and the the weakest MEMR95. The longest latencies RL500 were found in Cl2 which contained exclusively listeners with exclusively non-tuned PTCtf. Clusters 4-5 contained all listeners from the control group plus one listeners reporting tinnitus (s15). Cluster 5 contained only one listener (c20). All values for Cl4 were within the range of clusters 1-3 which indicates that it was the For the 3-cluster analysis, cluster 3 contained a single listener (C20). The main differences between clusters 1 and 2 were the PTCtf with only tuned and non-tuned characteristics,  Table 3. Resulting parameter values of the clustering analysis with five (upper rows) and three (lower rows) clusters. For each cluster (Cl, first column), the following parameters are shown: HFA mean (dB HL), MEMR mean, ACALOS mean, age mean, RL500 mean (in ms), PTCtf, PTCtf summary, THI mean, TonLik mean, T/N summary, Mono/Bi (M/B) summary, and TTC summary. In the 5-cluster analysis, all measures were included, while in the 3-cluster analysis the tinnitus measures were excluded. respectively. Listeners in Cl2 also had longer RL500 latencies compared to the listeners in Cl1. The main differences between clusters 1-2 and cluster 3 are the low HFA for c20 paired with a strong MEMR95. Overall, the listeners could be grouped in to clearly separable clusters. But even for a relatively low number of clusters (3 and 5), the number of listeners in each group became relatively low. Hence, the small sample size might have contributed to the heterogeneity of the members (control / tinnitus) in each cluster.

Random forest
The analysis of importance (Fig 8) showed that for the five clusters analysis the most important predictors were: tonal or noisy tinnitus (0.1297), TTC (0.1133), and monaural or binaural tinnitus (0.1127). The next factors had an importance of around 0.03 and were more variable, despite representing the mean of 50 repetitions of the analysis. Thesese factors included HFA (MMedian), MEMR at 95 dB (MEMR95), ACALOS and Likeness of tinnitus to a tone (TonLiken). The three most important measures were in the Only-Tinnitus category, that means applicable only to the tinnitus group. Not surprisingly, the most important measures are tinnitus-only measures. These easily separate the control (that lack these information) from the tinnitus group. However, The TonLiken predictor is not among the highest rated, but also tinnitus related.
The importance analysis with the clustering without tinnitus specific measures is shown in Fig 9. In this analysis the PTCtf is the most important predictor (0.1127) and all the others

PLOS ONE
except the latency of the reflex on a comparable level (around 0.02). The maximum level of Importance, however, for both cases is around 0.1-0.13.

Discussion and conclusions
In the present study, various data were collected with the goal to have multiple sources of variability as input into a cluster analysis. The test groups were compiled based on characteristics as homogeneous as possible within the two groups (tinnitus and controls) but also between the groups to reduce noise in the data. The analysis showed no significant difference between the tinnitus and the control group.
According to our hypothesis, a group of participants should have shown some characteristics usually linked to cochlear synaptopathy (CS). Our hypothesis related to CS was: i) increased thresholds at high frequencies [20], ii) decreased activation of the MEMR [26], iii) a lower MCL. Considering the first cluster with five groups, the cluster showing this characteristics was cluster 3. Interestingly, cluster 3 was also the one with higher average age ( Table 3). The type of tinnitus was mainly tonal for the first two clusters and noise-like for the cluster whose characteristics were closer to CS. Similarly, evidence exist that noise-like tinnitus can be more common in older patients with higher-level hearing loss [37]. Also the mean of the THI questionnaire responses was higher for Cluster 3. However, the three tinnitus groups were relatively small (Cluster 1 = 8 participants, Cluster 2 = 5 participants, Cluster 3 = 6 participants). Due to the small size of the clusters, no statistical analysis was run to assess the significance of these differences.  We hypothesized that participants with noise-like tinnitus would show a different PTCtf compared to the tinnitus participants with tonal tinnitus. In fact, the tonal tinnitus perception could be a distraction when trying to distinguish a trial with or without probe tone. The results showed that cluster 1 and 3 were the ones with mainly tuned (T) PTCtf, which we can consider to be consistent with PTC in normal hearing listeners. Cluster 1 was composed of participants having mainly tonal tinnitus, while cluster 3 was mainly composed of listeners with noise-like tinnitus. Therefore, the first result was not in agreement with the initial hypothesis. In addition to this, Listeners in Cluster 3 had a mean tinnitus frequency lower that the ones in Cluster 1 (7.67 kHz for Cluster 3, 11.06 kHz for Cluster 1), which made the task even harder. Listeners reported the task to be more demanding with higher frequencies when compared with the same test performed at 1 kHz. Cluster 2 and 3 are in line with the hypothesis that PTCtf shows a shape similar to non-tinnitus sufferers for noisy tinnitus than for tonal tinnitus. This is, however, inconsistent with Cluster 1.
For what concerns the monaural or binaural characteristic of tinnitus, the distribution was uniform across the three groups, so there was no group having a single output for this predictor. We hypothesized that TTC would have been mainly Non-tuned (NT) or Non-maskable (NM) for binaural tinnitus but it is not the case from the data, although participants with binaural tinnitus reported the high difficulty in disentangling the perception in the two ears.
It needs to be underlined that in this experiment HFA, PTC and TTC were performed with a Bayesian procedure. This allowed to decrease the duration of the experiment and to get a continuous result. However, the PTC at very high frequency was perceived as a very hard task, giving noisy results. Therefore, we decided to handle the results of the last two tests (PTC, TTC) grouping them into categories and this might have created some bias in the clustering. Exemplary outcomes of the PTC are shown in Fig 10. Panel A indicates the unreliability of the data for listeners with a high-pitched tinnitus. Panel B illustrates a listener where the tuning of the PTC likely seemed to be affected by the presence of the tinnitus. Panel C illustrates a listener where tinnitus frequency and PTC seemed independent from each other. These data illustrate the strong variability in this outcome measure. An automated procedure with a higher number categories might refine the clustering result, but will require a much higher number of listeners for the analysis.
In the second clustering we see, in fact, that the PTCtf takes over the clustering. However, the TTC and PTC can give interesting insights about the maskability of the tinnitus and the tuning of the cochlea, respectively, and shed light on the hearing mechanisms connected to tinnitus. The second clustering without tinnitus measures is less informative and the mean differences across groups are not very broad (Table 3). However, what can be extrapolated from this result is that tinnitus participants had no homogeneous and distinct features with respect to the controls. With more participants we would have expected to see a mixed group with tinnitus and controls having characteristics leaning towards CS.
The reason for the different findings across studies might lie in the differences how the measures were assessed. One study [18] measured MEMR using a threshold criterion of ". . .reduction in compliance of 0.02ml or greater with appropriate morphology and no evidence of significant measurement artifact" [18] using a clinical paradigm with a tonal probe and coupler calibration. Another study [17] and the present study used a broadband elicitor and an in-situ calibration method. [18] and the present study used ipsilateral elicitors, while [17] used contralateral elicitors. The metric used in [18] was a threshold, while [17] and the present study used a metric reflecting the MEMR strength. In [18] and in the present study, listener were assessed with audiometry up to 14 kHz (16 kHz in the present study), while [17] only screened the audiogram up to 8 kHz and included listeners in the tinnitus group up to 30 dB HL. In the present study, no significant differences were found in MEMR strength between the tinnitus group and the control group at any level measured (Wilcoxon rank sum test, p > 0.05). The authors in [17] argue that their paradigm reduced the influence of the medial olivocochlear (MOC) reflex by the low rate of click presentation. The paradigm used in [18] and in the present study is, however, more consistent with the most sensitive measure of synaptopathy in mouse [38]. Hence, it can not be excluded that different variations of MEMR strength variability across listeners and mixed effects of MOC reflex and high-frequency sensitivity loss could have an effect on the results. While the studies by [17] and [18] rely on a pairwise comparison of outcome measures, the grouping of the elements in the high-dimensional predictor space into clusters might be robust against the introduced variability and more sensitive to consistencies in correlations across multiple predictors. This benefit might, however, be jeopardized by the need for larger sample numbers to allow proper statistical estimates of the distributions of the data.
There might be additional factors contributing to the creation of clusters in the present study. One factor not explicitly controlled for is gender of the listeners. Previous studies on gender effects on comorbidities of tinnitus found none of weak correlations between gender and suggested tinnitus-related comorbidities. A recent study evaluated tinnitus severity using THI along with multiple indicators of psychiatric distress in a sample of 245 listeners [39]. They found no gender effect on tinnitus severity in their sample, but weak correlations of gender and a number of psychiatric conditions like depression (females) or stress (males). Another study [40] found similar results in a sample of 107 listeners with correlations only between gender and psychiatric comorbidities, but not direct measures of tinnitus like THI. The sample size in the present size is considerably lower and does not allow for a statistical analysis of similar correlations. However, the numbers in the various clusters were rather balanced and indicate no trend towards an impact of gender on the results (11/4 and 11/8 male/ female in the 3-cluster analysis in the two clusters containing more than one listener). It might be interesting to evaluate putative correlations between the measures used in the present study and psychiatric diseases by addition of another dimension in the clustering analysis.
Subject 15 (s15) became an outlier in the 5-cluster analysis. It was assigned into a cluster with listeners from the control group, despite reporting tinnitus. Even though only a individual listener, this results warrants some speculation. Based on the results, it is challenging to link the assignment of the listener to similarities to the control group. But it might be interesting to speculate about the exclusion of this listener from the tinnitus-dominated cluster, despite its proximity in the projection on the first two principle components. The listener reported a broadband tinnitus (TL, panel O), similar to a high number of other listeners in this group. It showed a relatively flat HFA below (20 dB HL in the measured range) which was highly similar to a large number of other listeners in the group. Also loudness perception (ACALOS) and the psychophysical tuning at 1 kHz (PTC1k) seemed similar to a high number of other listeners. This listener differed from the other listeners in the tinnitus group in terms of the MEMR. Compared to most other listeners showed the MEMR of s15 (panel N) a relatively flat profile with little variation with intensity. At low frequencies, there is a tendency towards a rather constant delta-absorbance which is different to most other listeners. for other listeners, the delta absorbance tends to be more negative between 0.5 kHz and 1 kHz compared to frequencies below 0.5 kHz. Only at the highest level measured (105 dB) was the MEMR more in agreement with that of the other listeners with negative values up to about 1 kHz and positive values above 1 kHz. The psychophysical tuning curve at tinnitus frequency (PTCtf) showed a, compared to the other listeners, relatively shallow U-shape. Also in the tinnitus tuning curve (TTC, panel M), this listeners showed a concave shape with a maximum at the tinnitus frequency. The relatively unique shape of the MEMR indicates a reduced sound level sensitivity of the mechanism modifying the sound transduction into the cochlea. This mechanism was only visible in the data at the highest level measures. The PTCtf and the TTC might suggest that there is an interaction of the acoustic probe with the tinnitus mechanism. In the PTCtf, the U-shape seems broadened due to the presence of the tinnitus. In the TTC, the concave shape and the peak at the tinnitus frequency might be a sign for entrainment of the tinnitus frequency to the external probe tone. All these indications are derived from the data of a single listener. It might be interesting to confirm similarity across these measures in a group of listeners selected based on a single of the measures. It needs, however, to be noted that the relatively small sample size might lead to a poor estimate of the high-dimensional distributions derived in the predictor space. This undersampling might hence lead to noise in the assignment to clusters and hence the appearance of this outlier.
In conclusion, the use of clustering for the categorization of tinnitus participants revealed that tinnitus participants with normal hearing can still have very different outputs in many tests. Confirming, in other words, that tinnitus is a very diverse phenomenon for causes, comorbidity and hearing disorders. However, the value of the silhouette function, showing the validity of the classification, was not very high. This means that to confirm the results of this preliminary study a higher number of participants needs to be recruited. Finally, clustering with a big data set, potentially including dimensions of psychiatric comorbidities could be the starting point towards a refined categorization of tinnitus.  (from s01 to s10). Each participant has been tested twice except participant s17, which was struggling with the task. Each label on the top right of each panel is composed by two parts: the first is an incremental index and the second one is the label for the specific participant. On the y-axis are presented the hearing thresholds and on the x-axis the frequencies (from 8 kHz up to 16 kHz). The central lighter line represents the threshold. The two darker lines represent the limits of the one standard deviation shift. Circles represent trials that have been heard, and crosses represent sounds that have not been heard. The results for participants s17 and s19 are missing due to technical problems. In each panel on the y-axis is reported the level of the masker, while on the x-axis is reported the frequency of the masker. The target is a tone of frequency 1 kHz and level 5 dB higher than the individual threshold. The central lighter line represents the PTC. The two darker lines represent the limits of the one standard deviation shift. The circles represent the trials in which the target has been heard, and crosses represent the trials in which the target has not been heard. (EPS) S15 Fig. Individual results for psychophysical tuning curve at 1 kHz (control group). The figure represents the individual results of the Psychophysical Tuning Curve (PTC) at 1 kHz for all the control participants (from c01 to c20, panel A-R). The numeric part of the labels for each participant are equal to the labels of the tinnitus participant matched (in hearing loss and age). In each panel on the y-axis is reported the level of the masker, while on the x-axis is reported the frequency of the masker. The target is a tone of frequency 1 kHz and level 5 dB higher than the individual threshold. The central lighter line represents the PTC. The two darker lines represent the limits of the one standard deviation shift. The circles represent the trials in which the target has been heard, and crosses represent the trials in which the target has not been heard.